Abstract —We study the existence or absence of non-Shannon inequalities for variables that are related by functional dependencies. Although the power set on four variables is the smallest Boolean lattice with non-Shannon inequalities, there exist lattices with many more variables without non-Shannon inequalities. We search for conditions that ensure that no non-Shannon inequalities exist. It is demonstrated that 3-dimensional distributive lattices cannot have non-Shannon inequalities and that planar modular lattices cannot have non-Shannon inequalities. The existence of non-Shannon inequalities is related to the question of whether a lattice is isomorphic to a lattice of subgroups of a group.

Index Terms —Functional dependence, lattice, modularity, non-Shannon inequality, polymatroid.

I. Introduction 

The existence of non-Shannon inequalities has received a lot of attention since the first such inequality was discovered by Z. Zhang and R. W. Yeung [1]. The basic observation is that any four random variables $A$, $B$, $C$, and $D$ satisfy the following inequality

$$2I(C;D) \le I(A;B) + I(A;C \uplus D) + 3I(C;D \mid A) + I(C;D \mid B) \tag{1}$$

where $I(A;B)$ denotes the mutual information between $A$ and $B$ and $I(C;D \mid B)$ denotes the conditional mutual information between $C$ and $D$ given $B$. Finally, $C \uplus D$ here denotes the random variable that takes values of the form $(c,d)$ where $c = C$ and $d = D$. The inequality is non-Shannon in the sense that it cannot be deduced from inequalities of the form

$$H(X \uplus Y) \ge H(X)$$

$$I(X;Y \mid Z) \ge 0.$$

Using that $I(X;Y) = H(X) + H(Y) - H(X \uplus Y)$ and

$$I(X;Y \mid Z) = H(X \uplus Z) + H(Y \uplus Z) - H(X \uplus Y \uplus Z) - H(Z)$$

the last inequality can be rewritten as

$$H(X \uplus Z) + H(Y \uplus Z) \ge H(X \uplus Y \uplus Z) + H(Z),$$

which we will call the sub-modular inequality. Therefore the Shannon inequalities are the ones that can be deduced using that entropy is non-negative, increasing and submodular. Later it was shown by F. Matúš [2] that for four variables there exist infinitely many non-Shannon inequalities. It is easy to show that any inequality involving only three variables rather than four can be deduced from Shannon's inequalities. Now the power set of four variables is a Boolean algebra with 16 elements and any smaller Boolean algebra corresponds to a smaller number of variables, so in a trivial sense the Boolean algebra with 16 elements is the smallest Boolean algebra for which there exist non-Shannon inequalities.

In the literature on non-Shannon inequalities all inequalities are expressed in terms of sets of variables and their joins. Another way to formulate this is that the inequalities are stated for the free $\uplus$-semilattice generated by a finite number of variables. In this paper we will also consider intersections of variables. We note that for sets of variables we have the inequality

$$I(X;Y \mid Z) \ge H(X \cap Y \mid Z).$$

This inequality has even inspired some authors to use the notation $I(\cdot \wedge \cdot)$ to denote mutual information.

Although non-Shannon inequalities have been known for more than a decade, they have found remarkably few applications compared with Shannon's inequalities. One of the reasons for this is that there exist lattices much larger than the Boolean algebra with 16 elements that have no non-Shannon inequalities. The simplest examples are the deterministic Markov chains

$$X_1 \to X_2 \to X_3 \to \cdots \to X_n$$

where $X_1$ determines $X_2$, which determines $X_3$, etc. For such a chain one has

$$H(X_1) \ge H(X_2) \ge H(X_3) \ge \cdots \ge H(X_n) \ge 0.$$

These inequalities are all instances of monotonicity of the entropy function, and it is quite clear that these inequalities are sufficient in the sense that for any sequence of values that satisfies these inequalities there exist random variables related by a deterministic Markov chain with these values as entropies.
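The monotonicity along a deterministic chain can be illustrated numerically. The following is a minimal sketch, assuming a uniform first variable on eight values and hypothetical maps (integer halving) between consecutive variables; neither choice is taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


# X1 is uniform on {0, ..., 7}; every later variable is a function of the
# previous one, so the variables form a deterministic chain.
x1 = list(range(8))
x2 = [v // 2 for v in x1]  # X2 = f(X1), uniform on 4 values
x3 = [v // 2 for v in x2]  # X3 = g(X2), uniform on 2 values
x4 = [0 for _ in x3]       # X4 is constant

chain_entropies = [entropy(x) for x in (x1, x2, x3, x4)]

# Monotonicity: the entropies never increase along the chain.
is_decreasing = all(a >= b for a, b in zip(chain_entropies, chain_entropies[1:]))
```

Here the entropies are 3, 2, 1 and 0 bits, so the chain of inequalities holds with strict inequality at every step.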

In this paper we look at entropy inequalities for random variables that are related by functional dependencies. Functional dependencies give an ordering of variables into a lattice. Such functional dependence lattices have many applications in information theory, but in this short note we will focus on the question of how one can detect whether a lattice of functionally related variables can have non-Shannon inequalities. In particular we are interested in determining the "smallest" lattice with non-Shannon inequalities. Here we should note that there are several ways of measuring the size of a lattice, and also note that in order to achieve interesting results we have to restrict our attention to special classes of lattices.

Non-Shannon inequalities have been studied using matroid theory, but matroids are equivalent to atomistic semimodular lattices. For the study of non-Shannon inequalities it is more natural to look at general lattices rather than matroids because many important applications involve lattices that are not atomistic or not semimodular. For instance the deterministic Markov chain gives a lattice that is not atomistic. It is known that a function is entropic if and only if it is (approximately) equal to the logarithm of the index of a subgroup in a group. The lattice of subgroups of an Abelian group is modular and atomistic and can be described by matroid theory. A switch from matroids to lattices corresponds to a switch from Abelian groups to more general groups.

II. Lattices of functional dependence 

Many problems in information theory and cryptography
can be formulated in terms of functional dependencies. For
instance one might be interested in giving each member
of a group part of a password in such a way that no
single person can recover the whole password but any
two members are able to recover the password. Here the
password should be a function of the variables known by
any two members but not a function of a variable held
by any single member. In this section we shall briefly 
describe functional dependencies and their relation to 
lattice theory. The relation between functional dependence 
and lattices has previously been studied [3] , [1] , [S] , [S] ■ The 
relation between functional dependencies and Bayesian 
networks is described in [7]. 

Inspired by Armstrong's theory of relational databases [5] we say that a relation $\to$ in a lattice $L$ satisfies Armstrong's axioms if it satisfies the following properties.

Transitivity If $X \to Y$ and $Y \to Z$, then $X \to Z$.

Reflexivity If $X \ge Y$, then $X \to Y$.

Augmentation If $X \to Y$, then $X \uplus Z \to Y \uplus Z$.

In a database $X \to Y$ means that there exists a function $f$ such that $Y = f(X)$. Functional dependence obviously satisfies these inference rules, so as an axiomatic system it is sound. Armstrong proved that these axioms form a complete set of inference rules. That means that if a set $A$ of functional dependencies is given and a certain functional dependence $x \to y$ holds in any database where all the functional dependencies in $A$ hold, then $x \to y$ can be deduced from $A$ using Armstrong's axioms. Therefore for any functional dependence $x \to y$ that cannot be deduced using Armstrong's axioms there exists a database where the functional dependence is violated [9], [10]. As a consequence there exists a database where a functional dependence holds if and only if it can be deduced from Armstrong's axioms. A lattice element $X$ is said to be closed if $X \to Y$ implies that $X \ge Y$. The smallest closed lattice element greater than or equal to $X$ will be denoted $\mathrm{cl}(X)$.
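For sets of attributes, the closure under a given list of functional dependencies can be computed by repeatedly applying the dependencies until nothing changes. The sketch below uses the standard attribute-closure procedure from database theory; the attribute names and the two dependencies are hypothetical and only serve as an illustration.

```python
def closure(attributes, dependencies):
    """Smallest closed attribute set containing `attributes`.

    `dependencies` is a list of pairs (lhs, rhs) of frozensets, read as
    "lhs determines rhs".
    """
    closed = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            # If the left-hand side is already determined, so is the right-hand side.
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return frozenset(closed)


def determines(x, y, dependencies):
    """True if x -> y can be deduced from the given dependencies."""
    return frozenset(y) <= closure(x, dependencies)


# Hypothetical dependencies: {a, b} -> {c} and {c} -> {d}.
deps = [
    (frozenset("ab"), frozenset("c")),
    (frozenset("c"), frozenset("d")),
]

closure_ab = closure("ab", deps)  # {a, b, c, d}
closure_a = closure("a", deps)    # {a}: the set {a} is closed
```

In this example `{a}` and `{c, d}` are closed, while `{a, b}` is not, because its closure also contains `c` and `d`.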

Theorem 1. The set of closed elements forms a lattice. For any finite lattice there exists a set of related variables such that the elements of the lattice correspond to closed sets under functional dependence.

According to the theorem any lattice can be considered as the closed sets of variables under some functional dependence relation where $X \to Y$ if and only if $X \supseteq Y$. On the set of closed sets the meet operation is given by $X \cap Y$ and the join operation is given by $X \uplus Y = \mathrm{cl}(X \cup Y)$. We observe that the set of closed sets is a subset of the original lattice that is closed under intersection. Such a subset is called a $\cap$-semilattice or a closure system. Any closure system defines a relation that satisfies Armstrong's axioms.

On a lattice, submodularity of a function $h$ is defined via the inequality $h(X) + h(Y) \ge h(X \uplus Y) + h(X \cap Y)$. A polymatroid function on a lattice can then be defined as a function that is non-negative, increasing and sub-modular. The equation $h(X \uplus Z) + h(Y \uplus Z) = h(X \uplus Y \uplus Z) + h(Z)$ defines a relation denoted $(X \perp Y \mid Z)$ that satisfies the properties:

Existence $(X \perp Y \mid X)$.

Symmetry $(X \perp Y \mid W)$ if and only if $(Y \perp X \mid W)$.

Decomposition If $(X \perp Y \uplus Z \mid W)$ then $(X \perp Z \mid W)$.

Contraction $(X \perp Z \mid W)$ and $(X \perp Y \mid Z \uplus W)$ implies $(X \perp Y \uplus Z \mid W)$.

Weak union $(X \perp Y \uplus Z \mid W)$ implies $(X \perp Y \mid Z \uplus W)$.

We say that a relation that satisfies these properties is semi-graphoid. Note that we allow the elements to have non-empty intersection. Note also that the existence property is normally not included in the list of semi-graphoid properties. If $(B \perp B \mid A)$ we write $A \to_\perp B$. If $h$ denotes the Shannon entropy then $A \to_\perp B$ simply means that $H(B \mid A) = 0$ or equivalently that $B$ is almost surely a function of $A$.
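The polymatroid conditions and the derived dependence relation can be checked by brute force on a small example. The following sketch assumes three binary variables with `C = A xor B` and `A`, `B` independent and uniform (a hypothetical example, not one from the paper), takes the lattice to be the full power set, and tests `A ->_perp B` through the equivalent condition `h(A ∪ B) = h(A)`.

```python
from collections import Counter
from itertools import combinations, product
from math import log2

EPS = 1e-9


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


# Equally likely outcomes (a, b, c) with c = a xor b.
names = ("A", "B", "C")
outcomes = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]


def h(subset):
    """Joint entropy of the variables named in `subset`."""
    idx = [names.index(n) for n in sorted(subset)]
    return entropy([tuple(o[i] for i in idx) for o in outcomes])


subsets = [frozenset(c) for r in range(4) for c in combinations(names, r)]


def is_polymatroid():
    """Check non-negativity, monotonicity and submodularity on the power set."""
    nonneg = all(h(s) >= -EPS for s in subsets)
    increasing = all(h(s) <= h(t) + EPS for s in subsets for t in subsets if s <= t)
    submodular = all(
        h(s) + h(t) + EPS >= h(s | t) + h(s & t) for s in subsets for t in subsets
    )
    return nonneg and increasing and submodular


def determines(x, y):
    """x ->_perp y, i.e. (y ⊥ y | x), which reduces to h(x ∪ y) = h(x)."""
    return abs(h(x | y) - h(x)) < EPS
```

With these definitions `{A, B}` determines `{C}`, whereas `{A}` alone does not determine `{C}`.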

Theorem 2. If $(L, \cap, \uplus)$ is a lattice with a semi-graphoid relation $(\cdot \perp \cdot \mid \cdot)$ then the relation $\to_\perp$ satisfies Armstrong's axioms. The relation $(\cdot \perp \cdot \mid \cdot)$ restricted to the lattice of closed lattice elements is semi-graphoid. If the semi-graphoid relation $(\cdot \perp \cdot \mid \cdot)$ is given by a polymatroid function $h$ then $h$ is also polymatroid if it is restricted to the lattice of closed elements.

III. Entropy in functional dependence lattices 

Definition 3. A polymatroid function $h$ on a lattice $L$ is said to be entropic if there exists a function $f$ from $L$ into a set of random variables such that $h(x) = H(f(x))$ for any element $x$ in the lattice.


Let $L$ denote a lattice and let $\Gamma(L)$ denote the set of polymatroid functions on $L$. Let $\Gamma^*(L)$ denote the set of entropic functions on $L$ and let $\bar{\Gamma}^*(L)$ denote the closure of this set.

Definition 4. A lattice is said to be a Shannon lattice if any polymatroid function can be realized approximately by random variables, i.e. $\Gamma(L) = \bar{\Gamma}^*(L)$.

Both $\Gamma(L)$ and $\bar{\Gamma}^*(L)$ are polyhedral sets and often we may normalize the polymatroid functions by requiring that the value at the maximal element is 1. One may then check whether a lattice is a Shannon lattice by checking that the extreme polymatroid functions are entropic.

From the definition we immediately get the following 
result. 

Proposition 5. If $L$ is a Shannon lattice and $M$ is a subset that is a $\cap$-semilattice then $M$ is a Shannon lattice. In particular all sub-lattices of a Shannon lattice are Shannon lattices.

With these results at hand we can start hunting non-Shannon lattices. We take a lattice that may or may not be a Shannon lattice. We find the extreme polymatroid functions and for each extreme point we determine the lattice of closed elements using Theorem 2. Each of these lattices of closed sets has a much simpler structure than the original lattice and the goal is now to check whether these lattices are Shannon lattices or not. It turns out that there are quite few of these reduced lattices and they could be considered as the building blocks for larger lattices.

The simplest lattice has just two elements. The only normalized polymatroid function takes the values zero and one. It is obviously entropic.

We recall that an element $i$ is $\uplus$-irreducible if $i = x \uplus y$ implies that $i = x$ or $i = y$. An $\cap$-irreducible element is defined similarly. An element is double irreducible if it is both $\uplus$-irreducible and $\cap$-irreducible. The lattice denoted $M_n$ is a modular lattice with a smallest element, a largest element and $n - 2$ double irreducible elements arranged in-between.



Figure 1. Hasse diagram of the lattice $M_7$.

Theorem 6. For any $n$ the lattice $M_n$ is a Shannon lattice.

Proof: The proof is essentially the same as the solution to the cryptographic problem stated in the beginning of Section II. The idea is that one should look for groups with a subgroup lattice $M_n$ and then check that the subgroups of such a group actually have the right cardinality. ■

Corollary 7. Any polymatroid function that only takes the values 0, 1/2, and 1 is entropic.

Proof: Assume that the polymatroid function $h$ only takes the values 0, 1/2, and 1. Then $h$ defines a semi-graphoid relation and the closed elements form a lattice isomorphic to $M_n$ for some integer $n$. The function $h$ is entropic on $M_n$, so $h$ is also entropic on the original lattice.



The Boolean lattice with four atoms is the smallest non-Shannon Boolean algebra. Nevertheless there are smaller non-Shannon lattices. Figure 2 illustrates a lattice with just 11 elements that violates Inequality (1). This corresponds to the fact that the lattice in Figure 2 is not equivalent to a lattice of subgroups of a finite group. The lattices that are equivalent to lattices of subgroups of finite groups have been characterized, but the characterization is too complicated to describe in this short note. Using the ideas from [2] one can prove that the lattice in Figure 2 has infinitely many non-Shannon inequalities. Note that this lattice is atomistic but not semimodular, but it is lower locally distributive. Any semimodular lattice that contains the lattice in Figure 2 as a $\cap$-semilattice also contains a power set on four points as a $\cap$-semilattice. The following lemma gives a considerable reduction in the number of inequalities that one has to consider in the search for extreme polymatroid functions.

Lemma 8. If $h$ is submodular and increasing on $\cap$-irreducible elements then $h$ is increasing.

Theorem 9. The lattice in Figure 2 is the lower locally distributive non-Shannon lattice with fewest elements.

Proof: There exists a nice representation of lower locally distributive lattices [12] (in that paper the author works with the dual lattices). With this representation it is relatively simple to create a list of all lower locally distributive lattices with 11 elements or fewer. Each lattice has finitely many extreme polymatroid functions. These can be found using the R program with the package rcdd. Each of these extreme polymatroid functions in each of these lattices has been checked to be entropic. ■

One may ask if there exists a lattice with fewer points 
than 11 that is non-Shannon. 

Theorem 10. Any lattice with 7 or fewer elements is a Shannon lattice.

Proof: Up to isomorphism there exist only finitely many lattices with 7 or fewer elements. Each lattice has finitely many extreme polymatroid functions. These can be found using the R program with the package rcdd. Each of these extreme polymatroid functions in each of these lattices has been checked to be entropic. ■

IV. Ingleton inequalities

The polymatroid function on Figure 2 does not only violate some non-Shannon inequalities, but it also violates an Ingleton inequality [?]. The Ingleton inequalities are inequalities of the form

$$h(C) + h(D) + h(A \uplus C \uplus D) + h(B \uplus C \uplus D) + h(A \uplus B) \le h(C \uplus D) + h(C \uplus A) + h(C \uplus B) + h(D \uplus A) + h(D \uplus B).$$

A more instructive way of formulating the Ingleton inequalities is in terms of conditional mutual information:

$$I(X;Y \mid Z) \le I(X;Y \mid Z \uplus V) + I(X;Y \mid Z \uplus W) + I(V;W \mid Z).$$
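The conditional mutual information form of the Ingleton inequality is easy to evaluate for a polymatroid function given as a table. The sketch below works on the power set of four points and uses a hypothetical polymatroid (value 2 on singletons, 3 on pairs except 4 on the pair `AB`, and 4 on all larger sets); it is chosen for illustration and is not claimed to be the function on the lattice in Figure 2.

```python
from itertools import combinations

GROUND = "ABCD"
SUBSETS = [frozenset(c) for r in range(5) for c in combinations(GROUND, r)]
EMPTY = frozenset()


def violating_h():
    """Hypothetical polymatroid on the power set of {A, B, C, D}."""
    table = {}
    for s in SUBSETS:
        if len(s) <= 1:
            table[s] = 2 * len(s)
        elif len(s) == 2:
            table[s] = 4 if s == frozenset("AB") else 3
        else:
            table[s] = 4
    return table


def is_polymatroid(h):
    """Brute-force check of monotonicity and submodularity (h(empty) = 0)."""
    increasing = all(h[s] <= h[t] for s in SUBSETS for t in SUBSETS if s <= t)
    submodular = all(h[s] + h[t] >= h[s | t] + h[s & t] for s in SUBSETS for t in SUBSETS)
    return h[EMPTY] == 0 and increasing and submodular


def cmi(h, x, y, z=EMPTY):
    """I(x; y | z) expressed through the function h."""
    return h[x | z] + h[y | z] - h[x | y | z] - h[z]


def ingleton_gap(h, x, y, v, w, z=EMPTY):
    """Right-hand side minus left-hand side; negative means a violation."""
    return cmi(h, x, y, z | v) + cmi(h, x, y, z | w) + cmi(h, v, w, z) - cmi(h, x, y, z)


A, B, C, D = (frozenset(p) for p in GROUND)
h_bad = violating_h()
h_free = {s: len(s) for s in SUBSETS}  # four independent fair bits

gap_bad = ingleton_gap(h_bad, C, D, A, B)
gap_free = ingleton_gap(h_free, C, D, A, B)
```

For the hypothetical table the gap is negative, so this polymatroid function cannot be the rank function of a representable matroid, while the entropy function of four independent bits satisfies the inequality with equality.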

The Ingleton inequalities are satisfied for rank functions of representable matroids. In particular all entropic functions that can be described by Abelian groups satisfy the Ingleton inequalities. If a polymatroid on a lattice satisfies the Ingleton inequality the associated semi-graphoid relation satisfies the following property.

Strong contraction If $(X \perp Y \mid Z \uplus V)$ and $(X \perp Y \mid Z \uplus W)$ and $(V \perp W \mid Z)$ then $(X \perp Y \mid Z)$.

Like the Ingleton inequality, strong contraction does not hold for all entropic polymatroid functions, but it does hold for most graphical models of independence like Bayesian networks. Recently it was demonstrated that strong contraction is essential for giving a lattice characterization of a certain system of inference rules for conditional independence [13]. In [13] strong contraction was used in conjunction with the following property.

Strong union If $(X \perp Y \mid Z)$ then $(X \perp Y \mid Z \uplus W)$.

Strong union is a quite restrictive condition, but it does hold for Markov chains and other Markov networks. The entropy inequality corresponding to strong union is

$$I(X;Y \mid Z) \ge I(X;Y \mid Z \uplus W).$$

If a polymatroid function satisfies the strong union inequality we get a significant reduction in the complexity of the problem.

Computer experiments support the following conjecture. 


Conjecture 11. If a polymatroid function on a lattice 
satisfies the Ingleton inequalities and the strong union 
inequalities then the function is entropic. 

It is worth noting that in [?] the authors use a lattice
technique that is slightly different from the one developed 
in the present paper. 

V. Distributive and modular lattices 

The power set of four variables is a distributive lattice, so one may ask if there exists any distributive lattice with non-Shannon inequalities without this Boolean lattice as a sub-lattice. We recall that a lattice is said to be modular if $a \subseteq b$ implies that

$$a \uplus (x \cap b) = (a \uplus x) \cap b$$

for any lattice element $x$. For modular lattices the following lemma gives a considerable reduction in the number of inequalities that one has to consider in the search for extreme points.

Lemma 12. Let $L$ be a modular lattice with a function $h$ that is submodular on any sub-lattice with elements $a$, $b$, $a \cap b$ and $a \uplus b$ where $a \cap b$ is covered by $a$ and $b$. Then the function $h$ is submodular on $L$.
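Whether a given finite lattice is modular can be decided by testing the modular law for all triples. The sketch below represents a lattice by its elements and the cover pairs of its Hasse diagram (a representation chosen here for convenience) and compares the standard five-element non-modular lattice with the five-element lattice that has three double irreducible elements.

```python
def make_lattice(elements, covers):
    """Return (leq, join, meet) for a finite lattice given by cover pairs (lower, upper)."""
    # below[e] collects all elements less than or equal to e.
    below = {e: {e} for e in elements}
    changed = True
    while changed:
        changed = False
        for lo, hi in covers:
            new = below[lo] - below[hi]
            if new:
                below[hi] |= new
                changed = True

    def leq(a, b):
        return a in below[b]

    def join(a, b):
        uppers = [u for u in elements if leq(a, u) and leq(b, u)]
        return next(u for u in uppers if all(leq(u, v) for v in uppers))

    def meet(a, b):
        lowers = [m for m in elements if leq(m, a) and leq(m, b)]
        return next(m for m in lowers if all(leq(k, m) for k in lowers))

    return leq, join, meet


def is_modular(elements, covers):
    """Check a ∨ (x ∧ b) = (a ∨ x) ∧ b for all a <= b and all x."""
    leq, join, meet = make_lattice(elements, covers)
    return all(
        join(a, meet(x, b)) == meet(join(a, x), b)
        for a in elements
        for b in elements
        if leq(a, b)
        for x in elements
    )


# Five-element non-modular lattice: 0 < a < c < 1 and 0 < b < 1.
pentagon = (["0", "a", "b", "c", "1"],
            [("0", "a"), ("a", "c"), ("c", "1"), ("0", "b"), ("b", "1")])

# Five-element lattice with three double irreducible elements x, y, z.
diamond = (["0", "x", "y", "z", "1"],
           [("0", "x"), ("0", "y"), ("0", "z"), ("x", "1"), ("y", "1"), ("z", "1")])

pentagon_is_modular = is_modular(*pentagon)
diamond_is_modular = is_modular(*diamond)
```

The first lattice fails the modular law for the triple `a ≤ c` and `x = b`, while the second one satisfies it for every triple.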

For a distributive lattice the order dimension equals the maximal number of $\uplus$-irreducible elements (or maximal number of $\cap$-irreducible elements) needed in a decomposition of an element in the lattice. Distributive lattices may also be represented as ideals in partially ordered sets, and the order dimension is also equal to the size of a maximal anti-chain in the partially ordered set used in such a representation.

Theorem 13 ([15]). Let $L$ be a distributive lattice. Then $L$ can be embedded as a sub-lattice into the $n$-th power of a chain if and only if it has order dimension at most $n$.

Theorem 14. A distributive lattice is Shannon if and only if its order dimension is at most 3.


The free distributive lattice with three generators is a lattice with the property that any distributive lattice generated by three elements is isomorphic to a sub-lattice. The free distributive lattice with three generators has 18 elements [?, pp. 45-46, Theorem 10] and is three dimensional. Therefore we get the following result.

Corollary 15. Any distributive lattice with 3 generators 
is a Shannon lattice. 

With three generators one can also define the free modular lattice. This lattice has 28 elements [?, pp. 46-47, Theorem 11] and by explicit calculations one can check that it is a Shannon lattice.

Proposition 16. The free modular lattice with 3 generators is a Shannon lattice.

If we do not require that the lattice is modular (or belongs to some other nice lattice variety) the result does not hold. The free lattice with three generators contains a sub-lattice isomorphic to the four dimensional Boolean algebra, which is not a Shannon lattice. Therefore it would be interesting to know if there exist lattice varieties larger than the variety of modular lattices for which the free lattice with three generators in the variety is a Shannon lattice.

Theorem 17. Any modular planar lattice is a Shannon 
lattice. 

The proof uses the recent result that a planar modular lattice can be represented as a distributive lattice with a number of double irreducible elements added [?], as illustrated in Figure 4. Each of the extreme polymatroid functions on a planar modular lattice corresponds to a complicated cryptographic protocol or secret sharing scheme.

Figure 4. A planar modular lattice.










This appendix contains two sections with a more careful description of the relation between functional dependencies, lattices, semi-graphoid relations and polymatroid functions. The appendix also contains proofs of some of the theorems stated in the paper. In order to keep this note short some proofs have been shortened or omitted.

Appendix A 

The functional dependence lattices 

In this section we shall describe functional dependencies and relate them to lattice theory. Much of the terminology is taken from database theory. The relation between functional dependence and lattices has previously been studied

[3]) [3], [5]. 

First we shall consider a set of attributes/variables $V_i$ and subsets of this set of variables. Each attribute $V_i$ takes values in some set $W_i$. The set of subsets is also called the power set and is ordered by inclusion. With this ordering the power set is a lattice with intersection and union as lattice operations. We note that the smallest element in the lattice is the empty set $\emptyset$ and the largest element is the whole set. One considers a set of tuples (records) that all share the same attributes. A relation $R$ is a set of tuples and an assignment of a value in $W_i$ to each attribute $V_i$. One may think of a relation as a function from tuples to the product space $\prod W_i$. If $X$ and $Y$ are sets of attributes we say that $Y$ functionally depends on $X$ in the relation $R$ and write $X \to Y$ if $\prod_{i \in X} V_i(t_1) = \prod_{i \in X} V_i(t_2)$ implies

$$\prod_{i \in Y} V_i(t_1) = \prod_{i \in Y} V_i(t_2).$$

Inspired by Armstrong's theory of relational databases we say that a relation $\to$ in a lattice $L$ satisfies Armstrong's axioms if it satisfies the following properties.

Transitivity If $X \to Y$ and $Y \to Z$, then $X \to Z$.

Reflexivity If $X \ge Y$, then $X \to Y$.

Augmentation If $X \to Y$, then $X \vee Z \to Y \vee Z$.

Functional dependence $\to$ in a database obviously satisfies these inference rules, so as an axiomatic system it is sound. Armstrong proved that these axioms form a complete set of inference rules. That means that if a set $A$ of functional dependencies is given and a certain functional dependence $x \to y$ holds in any database where all the functional dependencies in $A$ hold, then $x \to y$ can be deduced from $A$ using Armstrong's axioms. Therefore for any functional dependence $x \to y$ that cannot be deduced using Armstrong's axioms there exists a database where the functional dependence is violated [9], [10]. As a consequence there exists a database where a functional dependence holds if and only if it can be deduced from Armstrong's axioms.

Theorem 18. A relation $\to$ on the elements of a lattice satisfies Armstrong's axioms if and only if $\to$ is a preordering that satisfies the following two properties.

Decomposition If $Z \to X \vee Y$, then $Z \to X$ and $Z \to Y$.

Union If $Z \to X$ and $Z \to Y$, then $Z \to X \vee Y$.

Proof: Assume that $\to$ satisfies Armstrong's axioms. Then $X \ge X$ implies $X \to X$ so that $\to$ is reflexive. To prove the union property assume that $Z \to X$ and $Z \to Y$. Then $Z \vee Z \to X \vee Z$ and $X \vee Z \to X \vee Y$ by augmentation. Then transitivity implies $Z \to X \vee Y$. To prove the decomposition property assume that $Z \to X \vee Y$. In the lattice we have $X \vee Y \ge X$ and by reflexivity $X \vee Y \to X$. Now transitivity implies $Z \to X$. In the same way it is proved that $Z \to Y$.

Assume that $\to$ is a preordering that satisfies decomposition and union. To prove reflexivity assume that $X \ge Y$. Then $X \to X \vee Y$, which according to the decomposition property implies $X \to Y$. To prove augmentation assume that $X \to Y$. We have $X \vee Z \to X$, which together with transitivity implies $X \vee Z \to Y$. By reflexivity we have $X \vee Z \to Z$. Therefore the union property implies that $X \vee Z \to Y \vee Z$. ■

The first half of this proof was essentially given by 
Armstrong. 

Let $L$ denote a lattice with a relation $\to$ such that Armstrong's axioms are satisfied. For simplicity assume that $L$ is finite. For $X \in L$ define $\mathrm{cl}(X)$ as $\bigvee Y_i$ where the join is taken over all $Y_i$ such that $X \to Y_i$. The union property implies that $\mathrm{cl}(X)$ is maximal in the set of variables determined by $X$. With these definitions we see that $X \to Y$ if and only if $\mathrm{cl}(X) \ge \mathrm{cl}(Y)$. For a relation that satisfies Armstrong's axioms the unary operator $\mathrm{cl}$ satisfies the following conditions:

Extensivity $X \le \mathrm{cl}(X)$.

Isotony $X \le Y$ implies $\mathrm{cl}(X) \le \mathrm{cl}(Y)$.

Idempotence $\mathrm{cl}(X) = \mathrm{cl}(\mathrm{cl}(X))$.

A unary operator that satisfies these three properties is called a closure operator. We say that $X$ is closed if $\mathrm{cl}(X) = X$. If $X$ and $Y$ are closed for some closure operator $\mathrm{cl}$ then it is easy to prove that $X \wedge Y$ is closed [?, Lemma 28]. A subset of a lattice that is closed under the meet operation is called a $\wedge$-semilattice or a closure system. In [?] closure systems were studied in more detail in the case where the lattice is a power set. The elements of the closure system $A$ are the closed elements under the closure operator defined by $\mathrm{cl}(\ell) = \bigwedge_{x \ge \ell,\, x \in A} x$.
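The closure operator attached to a closure system can be computed directly from its definition as a meet over all closed elements above the argument. The sketch below does this for a closure system on a power set, where the meet is intersection; the particular closure system on `{1, 2, 3}` is a hypothetical example that contains the whole set, so the meet is never taken over an empty family.

```python
from itertools import combinations


def make_closure(closure_system):
    """Closure operator of a family of sets closed under intersection.

    The family is assumed to contain the largest set.
    """
    members = [frozenset(m) for m in closure_system]

    def cl(s):
        s = frozenset(s)
        containing = [m for m in members if s <= m]
        # Meet (intersection) of all closed sets that contain s.
        return frozenset.intersection(*containing)

    return cl


# Hypothetical closure system on the power set of {1, 2, 3}.
system = [set(), {1}, {2}, {1, 2, 3}]
cl = make_closure(system)

all_subsets = [frozenset(c) for r in range(4) for c in combinations((1, 2, 3), r)]

extensive = all(s <= cl(s) for s in all_subsets)
isotone = all(cl(s) <= cl(t) for s in all_subsets for t in all_subsets if s <= t)
idempotent = all(cl(cl(s)) == cl(s) for s in all_subsets)
closed_elements = {s for s in all_subsets if cl(s) == s}
```

The closed elements recovered at the end coincide with the closure system that was used to define the operator.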

Proposition 19. Let $(L, \le)$ denote a finite lattice. Assume that a subset $A$ of $L$ is closed under the meet operation. Then $A$ is a lattice under the ordering $\le$.

Proof: The set $A$ is partially ordered by $\le$ so we just have to prove that any pair of elements in $A$ has a least upper bound and a greatest lower bound. The greatest lower bound of $x, y \in A$ is $x \wedge y$. The least upper bound of $x$ and $y$ is $\bigwedge_{x \vee y \le z,\, z \in A} z$. ■

The lattice operations in $A$ are given by $X \wedge_A Y = X \wedge Y$ and $X \vee_A Y = \mathrm{cl}(X \vee Y)$. In particular the closed elements of a functional dependence relation form a lattice, and this was essentially the main result of Armstrong although he did not use lattice terminology. The converse of Armstrong's results is also true:


Theorem 20. Let $(L, \le)$ denote a finite lattice with a closure system $A$. Then the relation $\to$ defined by $x \to y$ if $\mathrm{cl}(x) \ge \mathrm{cl}(y)$ satisfies Armstrong's axioms.

Proof: It is easy to see that $\to$ defines a preordering. The union property is proved as follows. Assume that $x \to y$ and $x \to z$. Then $\mathrm{cl}(x) \ge \mathrm{cl}(y) \ge y$ and $\mathrm{cl}(x) \ge \mathrm{cl}(z) \ge z$. Hence $\mathrm{cl}(x) \ge y \vee z$ and $\mathrm{cl}(x) \ge \mathrm{cl}(y \vee z)$ so that $x \to y \vee z$. The decomposition property is proved by reversing the argument. ■

The theorem as it is formulated here probably appears somewhere in the literature on lattices, although the author has not been able to locate a good reference.

Example 21. We consider three variables $a$, $b$ and $c$ that denote real numbers. Assume that $c = (a + b)^2$. Then the associated lattice is the lattice that is normally called $S_7$. This is illustrated in Figure 5.





Figure 5. To the left an influence diagram for three variables is drawn with arrows indicating the direction of influence. To the right the Hasse diagram for the corresponding lattice of functional dependence is drawn with the smallest element ($\emptyset$) indicated. The name of this lattice is $S_7$.

Even simple examples of functional dependence lattices 
may be complicated to describe if they are not based on 
simple causal relations between the variables. 

Example 22. This example concerns fruit from a supermarket. Variable $X$ tells whether the supermarket will sell it at normal price, or at a reduced price because it is close to the expiration date, or whether it is thrown out because the expiration date has been exceeded. Variable $Z$ describes whether the fruit tastes very fresh, is edible, or looks disgusting. The variable $Y$ tells whether the fruit will make you sick or not. The functional dependencies are given by $Y \subseteq X$ and $X \uplus Z = Y \uplus Z$. The lattice is $N_5$. This is the standard example of a lattice that is not modular.

Figure 6. Hasse diagram of the lattice $N_5$.

Theorem 23. Any finite lattice can be represented as a closure system on a power set.

Proof: Let $L$ be a lattice. For each $a \in L$ the principal ideal of $a$ is $\downarrow(a) = \{x \in L \mid x \le a\}$. This gives an embedding of $L$ into the power set of $L$ in such a way that meet in the lattice corresponds to intersection in the power set. ■
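The embedding by principal ideals can be checked on a concrete lattice. The sketch below uses the divisors of 12 ordered by divisibility, a hypothetical example in which the meet is the greatest common divisor, and verifies that meets are mapped to intersections.

```python
from math import gcd

# The divisors of 12 ordered by divisibility form a lattice with meet = gcd.
elements = [d for d in range(1, 13) if 12 % d == 0]


def principal_ideal(a):
    """All lattice elements below a, i.e. all divisors of a."""
    return frozenset(x for x in elements if a % x == 0)


embedding = {a: principal_ideal(a) for a in elements}

# Meet in the lattice corresponds to intersection in the power set.
meet_is_intersection = all(
    embedding[gcd(a, b)] == embedding[a] & embedding[b]
    for a in elements
    for b in elements
)

# The map is injective and reflects the ordering.
injective = len(set(embedding.values())) == len(elements)
order_preserved = all(
    (b % a == 0) == (embedding[a] <= embedding[b])
    for a in elements
    for b in elements
)
```

The image of the embedding is closed under intersection, so it is a closure system on the power set of the six divisors.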

As a result any lattice is equivalent to a lattice of functional dependence, so everything that can be said about functional dependence can be expressed in the language of lattices. Most of the time we will formulate our results in terms of closure systems. Since the notation for inclusion and intersection is fixed, we will use $\subseteq$ to denote the ordering of a functional dependence lattice and $\cap$ to denote the meet operation. If the lattice is the whole power set, i.e. a Boolean lattice, then we will use $\cup$ to denote the join operation. If we have not assumed that the lattice is Boolean we may use $\vee$ or $\uplus$ or $\sqcup$ or some similar symbol to denote the join operation.

With the above results we can prove that Armstrong’s 
axioms form a complete set of inference rules for functional 
dependencies. 

Theorem 24. For any finite lattice there exists a set of related variables such that the elements of the lattice correspond to closed sets under functional dependence.

Proof: A lattice can be represented as a closure system on a power set of some set $I$. To each element $i \in I$ we associate a binary variable $V_i$ with values in $W_i = \{0, 1\}$. Let $C$ denote the closed sets in the power set. For each $c \in C$ we assign a tuple $t_c$ so that

$$V_i(t_c) = 1 \text{ if } i \in c$$

and $V_i(t_c) = 0$ otherwise.

Assume that $a \supseteq b$. If $(V_i(t_{c_1}))_{i \in a} = (V_i(t_{c_2}))_{i \in a}$ for tuples $t_{c_1}$ and $t_{c_2}$, then $(V_i(t_{c_1}))_{i \in b} = (V_i(t_{c_2}))_{i \in b}$, and this holds for all tuples $t_{c_1}$ and $t_{c_2}$. Hence $a \to b$.

Assume that $a \to b$. According to the definition this means that $(V_i(t_{c_1}))_{i \in a} = (V_i(t_{c_2}))_{i \in a}$ implies $(V_i(t_{c_1}))_{i \in b} = (V_i(t_{c_2}))_{i \in b}$ for all tuples $t_{c_1}$ and $t_{c_2}$. The equation $(V_i(t_{c_1}))_{i \in a} = (V_i(t_{c_2}))_{i \in a}$ means that for all $i \in a$ we have $V_i(t_{c_1}) = V_i(t_{c_2})$, which is equivalent to $a \cap c_1 = a \cap c_2$. Similarly $(V_i(t_{c_1}))_{i \in b} = (V_i(t_{c_2}))_{i \in b}$ is equivalent to $b \cap c_1 = b \cap c_2$. Choose $c_1 = a$ and $c_2 = a \uplus b$. Then $a \cap c_1 = a \cap c_2$ is automatically fulfilled and $b \cap c_1 = b \cap c_2$ can be rewritten as $b \cap a = b \cap (a \uplus b) = b$, which implies that $a \supseteq b$. ■
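The construction in the proof can be replayed on a small closure system. The sketch below builds one binary tuple per closed set and checks that, among closed sets, functional dependence in the resulting relation is the same as reverse inclusion; the closure system on `{1, 2, 3}` is a hypothetical example.

```python
# Hypothetical closure system on I = {1, 2, 3}, closed under intersection.
closed_sets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2, 3})]
index_set = sorted(frozenset().union(*closed_sets))


def tuple_for(c):
    """The tuple t_c: attribute i has value 1 exactly when i belongs to c."""
    return {i: int(i in c) for i in index_set}


relation = [tuple_for(c) for c in closed_sets]


def depends(a, b):
    """a -> b in the relation: tuples that agree on a also agree on b."""
    for t1 in relation:
        for t2 in relation:
            agree_on_a = all(t1[i] == t2[i] for i in a)
            agree_on_b = all(t1[i] == t2[i] for i in b)
            if agree_on_a and not agree_on_b:
                return False
    return True


# For closed sets a and b, a -> b holds exactly when a contains b.
dependence_is_inclusion = all(
    depends(a, b) == (a >= b) for a in closed_sets for b in closed_sets
)
```

Note that the check ranges over closed sets only, which is how the theorem is stated; for a set that is not closed the relation may satisfy fewer dependencies than the closure operator suggests.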

Appendix B 

Independence in lattices 

In statistics one studies the relation $(A \perp B \mid C)$ meaning that $A$ and $B$ are independent given $C$, where $A$, $B$ and $C$ are disjoint subsets of a set $M$ of random variables with respect to a probability measure. We will call this notion of independence statistical independence.

We shall say that a relation $(\cdot \perp \cdot \mid \cdot)$ on a lattice $(L, \cap, \uplus)$ is a semi-graphoid relation if it satisfies the following axioms:

Existence $(X \perp Y \mid X)$.

Symmetry $(X \perp Y \mid W)$ if and only if $(Y \perp X \mid W)$.

Decomposition If $(X \perp Y \uplus Z \mid W)$ then $(X \perp Z \mid W)$.

Contraction $(X \perp Z \mid W)$ and $(X \perp Y \mid Z \uplus W)$ implies $(X \perp Y \uplus Z \mid W)$.

Weak union $(X \perp Y \uplus Z \mid W)$ implies $(X \perp Y \mid Z \uplus W)$.

These properties should hold for all $X, Y, Z, W \in L$. We note that statistical independence with respect to a probability measure is semi-graphoid. In this paper we are particularly interested in the case where the subsets are not disjoint. A relation that satisfies the last four properties for disjoint sets in a power set was said to be semi-graphoid [?]. In a recent paper [?] a much longer list of axioms for the notion of independence was given. Most of those axioms can be proved from the axioms stated in this paper.

Theorem 25. A semi-graphoid relation (-T- | •) satisfies 
the following properties. 

Reflexivity For all A we have (ATA | A). 

Normality If (ATT | W) then {XYY W IT | IT). 

Monotonicity If (ATT | IT) and T ttJ IT T Z then 
(ATZ I IT). 

Triviality (AT0 | T) 

Base monotonicity If {AAB \ D) and B A C A D 
then {AAB \ C). 

Transitivity If {AAB \ C) and {A±C \ D) and B A 
CAD then {AAB \ D). 

Autonomy If {AAA \ C) then {AAB \ C). 

In a power set of random variables we note that if A is 
independent of A given C then A is a function of C almost 
surely. If {BAB \ A) we write A Tj_ B. 

Definition 26. An semi-graphoid relation is said to be 
consistent with C if A C T is equivalent to (ATA | T). 

Theorem 27. If $(L, \cap, \uplus)$ is a lattice with a semi-graphoid
relation $(\cdot \perp \cdot \mid \cdot)$ then the relation $\to_\perp$ satisfies Armstrong’s
axioms. The relation $(\cdot \perp \cdot \mid \cdot)$ restricted to the lattice of
closed lattice elements is semi-graphoid.

Proof: Reflexivity of $\to_\perp$: This follows from
the reflexivity property of $\perp$.

Transitivity: Assume that $X \to_\perp Y$ and $Y \to_\perp Z$.
Autonomy implies that $(Z \perp Z \uplus X \mid Y)$ and by weak
union $(Z \perp Z \mid Y \uplus X)$. Autonomy and $X \to_\perp Y$ together
imply that $(Y \perp Z \mid X)$. Contraction then implies
$(Z \perp Y \uplus Z \mid X)$. Decomposition gives $(Z \perp Z \mid X)$.

Decomposition: This follows from the decomposition
property of $\perp$.

Union: Assume that $X \to_\perp Y$ and $X \to_\perp Z$.
Then $(Y \perp Y \mid X)$ and $(Z \perp Z \mid X)$ and by
autonomy $(Y \perp Y \uplus Z \mid X)$ and $(Z \perp Y \uplus Z \uplus Y \mid X)$.
Hence $(Z \perp Y \uplus Z \mid Y \uplus X)$ by weak union and
$(Y \uplus Z \perp Y \uplus Z \mid X)$ by contraction.

For the last result one just has to prove that
$(X \perp Y \mid Z)$ if and only if $(X \perp \mathrm{cl}_\perp(Y) \mid Z)$ if and only if
$(X \perp Y \mid \mathrm{cl}_\perp(Z))$. This follows from Armstrong’s results.

■

The significance of this theorem is that if we start with a 
semi-graphoid relation on a lattice then this semi-graphoid 
relation is also semi-graphoid when restricted to elements
that are closed under functional dependence.

Theorem 28. Any finite lattice can be represented as a
closure system of a semi-graphoid relation defined on a
power-set.

Proof: For any finite lattice L one identifies the elements
with sets of binary variables $V_i$, and a relation can
be defined where the tuples have the form $i_c$, $c \in L$, as
in the proof of Theorem [?]. Each tuple can be identified
with a point in the product space $\prod_i V_i$. Assign a uniform
distribution to the points in the product space that correspond to tuples. With
respect to this probability measure $(b \perp b \mid a)$ if and only if a
determines b almost surely. Since the probability measure
is discrete, $(b \perp b \mid a)$ is valid if and only if $a \supseteq b$. ■

The semi-graphoid relation defined in the proof of the 
previous theorem is based on the uniform distribution on 
the tuples. We note that any other distribution that has 
positive probability on the same tuples will also give a 
representation of the lattice in terms of a semi-graphoid 
relation. For disjoint sets independence will depend on the 
choice of probability distribution. 

Appendix C 

Proof of Proposition O 

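Assume that L is a Shannon lattice and that M is a sub-lattice.
Let $h : M \to \mathbb{R}$ denote a polymatroid function. For
$\ell \in L$ let $\tilde{\ell}$ denote the $m \in M$ that minimizes $h(m)$ under
the constraint that $m \supseteq \ell$. Define the function $\tilde{h}(\ell) =
h(\tilde{\ell})$. Now $\tilde{h}$ is an extension of h and with this definition
$\tilde{h}$ is non-negative and increasing. For $x, y \in L$ we have

\begin{align*}
\tilde{h}(x) + \tilde{h}(y) &= h(\tilde{x}) + h(\tilde{y}) \\
&\ge h(\tilde{x} \uplus \tilde{y}) + h(\tilde{x} \cap \tilde{y}) \\
&\ge \tilde{h}(x \uplus y) + \tilde{h}(x \cap y)
\end{align*}

because $\tilde{x} \uplus \tilde{y} \supseteq x \uplus y$ and $\tilde{x} \cap \tilde{y} \supseteq x \cap y$. Hence $\tilde{h}$
is submodular. By the assumption $\tilde{h}$ is entropic, so the
restriction of $\tilde{h}$ to M is also entropic.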
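The extension step in this proof can be tried out on a small example. The sketch below is only an illustration under stated assumptions: the lattice is taken to be the power set of {0, 1, 2}, the sub-lattice M and the values of h on it are made up, and the names `extend` and `is_polymatroid` are hypothetical. It checks the polymatroid properties of the extension, not the entropic property.

```python
from itertools import combinations

GROUND = (0, 1, 2)
# Assumed ambient lattice: all subsets of GROUND, ordered by inclusion.
L = [frozenset(c) for r in range(len(GROUND) + 1) for c in combinations(GROUND, r)]

# Assumed sub-lattice M (closed under union and intersection) with a polymatroid h on it.
h = {
    frozenset(): 0.0,
    frozenset({0}): 1.0,
    frozenset({1, 2}): 2.0,
    frozenset({0, 1, 2}): 2.5,
}


def extend(h_on_m, lattice):
    """Value at l is the smallest h(m) over all m in M that contain l."""
    return {l: min(v for m, v in h_on_m.items() if m >= l) for l in lattice}


def is_polymatroid(f, lattice):
    """Check non-negativity, monotonicity and submodularity on every pair."""
    for x in lattice:
        if f[x] < 0:
            return False
        for y in lattice:
            if x <= y and f[x] > f[y]:
                return False
            if f[x] + f[y] < f[x | y] + f[x & y]:
                return False
    return True


h_ext = extend(h, L)
```

Here `h_ext` agrees with `h` on M, and for example the element {1}, which is not in M, receives the value of the smallest element of M above it, namely {1, 2}.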


Appendix D 
Proof of Lemma [S] 

Assume that h is submodular and increasing on $\cap$-irreducible
elements. We have to prove that if $a \supseteq c$
then $h(a) \ge h(c)$. In order to obtain a contradiction
assume that c is a maximal element such that there exists
an element a such that $a \supseteq c$ but $h(a) < h(c)$. We may
assume that a covers c. Since h is increasing at $\cap$-irreducible
elements, c cannot be $\cap$-irreducible. Therefore there exists
a maximal element b such that $b \supseteq c$ but $b \not\supseteq a$. Since a
covers c we have $a \cap b = c$. According to the assumptions
$h(a) + h(b) \ge h(a \uplus b) + h(a \cap b)$ and $h(a \uplus b) \ge h(b)$
because c is a maximal element that violates monotonicity.
Therefore $h(a) \ge h(a \cap b) = h(c)$, which contradicts $h(a) < h(c)$.

Appendix E 
Proof of Theorem [B] 

Let the values in the double irreducible elements be
denoted $x_1, x_2, \dots, x_{n-2}$. If $n = 3$ the extreme polymatroid
functions are $x_1 = 0$ and $x_1 = 1$ and these
points are obviously entropic. If $n = 4$ the extreme
points are $(x_1, x_2) = (0, 1)$, $(x_1, x_2) = (1, 0)$ and
$(x_1, x_2) = (1, 1)$, which are all entropic.

Assume $n \ge 5$. Then the values should satisfy the
inequalities

\[ 0 \le x_i \le 1, \qquad x_i + x_j \ge 1 \quad (i \ne j). \]

If $(x_1, x_2, \dots, x_{n-2})$ is an extreme point then each variable
should satisfy one of the inequalities with equality.
Assume $x_i = 0$. Then sub-modularity implies that $x_j = 1$
for $j \ne i$. The extreme point $(1, 1, \dots, 1, 0, 1, \dots, 1)$ is obviously
entropic. If $x_i = 1$ this gives no further constraint
on the other values, so it corresponds to an extreme point
on a lattice with one less variable. Finally assume that
$x_i + x_j = 1$ for all $i \ne j$. Then $x_i = 1/2$ for all i.
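The case analysis can be cross-checked by brute force in the smallest case with three double irreducible elements ($n = 5$). The sketch below is an illustration only: it enumerates the vertices of the polytope $0 \le x_i \le 1$, $x_i + x_j \ge 1$ by solving every choice of three tight constraints exactly. The helper names `det3`, `solve3` and `extreme_points` are hypothetical, and the normalisation (bottom value 0, top value 1) is assumed.

```python
from fractions import Fraction
from itertools import combinations

# Constraints are stored as (coefficients, bound), meaning coefficients . x >= bound.
CONSTRAINTS = []
for i in range(3):
    unit = [1 if k == i else 0 for k in range(3)]
    CONSTRAINTS.append((unit, 0))                   # x_i >= 0
    CONSTRAINTS.append(([-c for c in unit], -1))    # x_i <= 1
for i, j in combinations(range(3), 2):
    CONSTRAINTS.append(([1 if k in (i, j) else 0 for k in range(3)], 1))  # x_i + x_j >= 1


def det3(m):
    """Determinant of a 3x3 integer matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; return None if it is singular."""
    d = det3(rows)
    if d == 0:
        return None
    solution = []
    for col in range(3):
        replaced = [[rhs[r] if c == col else rows[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(Fraction(det3(replaced), d))
    return tuple(solution)


def extreme_points():
    """Feasible points at which three linearly independent constraints are tight."""
    vertices = set()
    for triple in combinations(CONSTRAINTS, 3):
        point = solve3([c for c, _ in triple], [b for _, b in triple])
        if point is None:
            continue
        if all(sum(a * x for a, x in zip(coeffs, point)) >= bound
               for coeffs, bound in CONSTRAINTS):
            vertices.add(point)
    return vertices
```

The result consists of the three points with a single zero coordinate, the all-ones point, and the point with every coordinate equal to 1/2, in line with the three cases discussed above.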

We have to find $n - 2$ random variables $X_1, X_2, \dots, X_{n-2}$
that are pairwise independent but such that any two determine the
rest. Let p denote a prime larger than $n - 2$. Let Y and
Z denote independent random variables with values in $\mathbb{Z}_p$,
each with a uniform distribution. If $X_j$ is defined to be
equal to $Y + jZ$ then the variables $X_j$ are pairwise independent
and any pair of these random variables determines
all the other variables.
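The construction $X_j = Y + jZ$ over $\mathbb{Z}_p$ is easy to check by enumeration. The following sketch is an illustration with hypothetical helper names (`construct`, `pair_is_uniform`, `check`). It relies on the observation that if the pair $(X_i, X_j)$ takes all $p^2$ values as $(Y, Z)$ ranges over its $p^2$ equally likely values, then the pair is uniform on $\mathbb{Z}_p^2$ (so $X_i$ and $X_j$ are independent and uniform) and it determines $(Y, Z)$, hence every other $X_k$.

```python
from itertools import combinations, product


def construct(p, count):
    """Map each pair (y, z) in Z_p^2 to (X_1, ..., X_count) with X_j = y + j*z mod p."""
    return {(y, z): tuple((y + j * z) % p for j in range(1, count + 1))
            for y, z in product(range(p), repeat=2)}


def pair_is_uniform(table, p, i, j):
    """True if (X_i, X_j) takes all p^2 values over the p^2 equally likely inputs."""
    seen = {(xs[i], xs[j]) for xs in table.values()}
    return len(seen) == p * p


def check(p, count):
    """True if every pair of the constructed variables is uniform on Z_p^2."""
    table = construct(p, count)
    return all(pair_is_uniform(table, p, i, j)
               for i, j in combinations(range(count), 2))
```

For example `check(5, 4)` succeeds, while `check(3, 4)` fails because the multipliers 1 and 4 coincide modulo 3, which is why the prime has to exceed the number of variables.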

Instead of constructing the variables $X_1, X_2, \dots, X_{n-2}$
we can find a group G and subgroups
$G_1, G_2, \dots, G_{n-2}$ such that $|G| = |G_i|^2$ using general
results about entropy inequalities and groups [21]. The
group G can be chosen as $\mathbb{Z}_p \times \mathbb{Z}_p$ where p is some prime
number greater than $n - 2$. The group G has $p + 1$
subgroups isomorphic to $\mathbb{Z}_p$.
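The subgroup count can be confirmed by direct enumeration for small primes. The sketch below uses hypothetical helper names (`order_p_subgroups`, `check`) and assumes that p is prime. It lists the cyclic subgroups of $\mathbb{Z}_p \times \mathbb{Z}_p$ generated by a non-zero element and checks that there are $p + 1$ of them, that each has order p (so the square of its order equals the order of the whole group), and that any two of them intersect only in the identity.

```python
from itertools import combinations, product


def order_p_subgroups(p):
    """Subgroups of Z_p x Z_p generated by a single non-zero element (p prime)."""
    subgroups = set()
    for g in product(range(p), repeat=2):
        if g == (0, 0):
            continue
        subgroups.add(frozenset(((k * g[0]) % p, (k * g[1]) % p) for k in range(p)))
    return subgroups


def check(p):
    """Check the count, the orders and the pairwise trivial intersections."""
    subs = order_p_subgroups(p)
    trivial = frozenset({(0, 0)})
    return (len(subs) == p + 1
            and all(len(s) ** 2 == p * p for s in subs)
            and all(s & t == trivial for s, t in combinations(subs, 2)))
```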

Appendix F 
Proof of Lemma da 

Let a and b denote two lattice elements. We have to
prove that $h(a) + h(b) \ge h(a \uplus b) + h(a \cap b)$.

Assume that $x_1, x_2, \dots, x_n$ is a sequence of elements such
that $a \cap b \subseteq x_1 \subseteq x_2 \subseteq \dots \subseteq x_n \subseteq a$. Define $y_i = x_i \uplus b$. Then
modularity implies

\begin{align*}
x_{i+1} \cap y_i &= x_{i+1} \cap (b \uplus x_i) \\
&= (x_{i+1} \cap b) \uplus (x_{i+1} \cap x_i) \\
&= (a \cap b) \uplus x_i \\
&= x_i.
\end{align*}

We also have

\begin{align*}
x_{i+1} \uplus y_i &= x_{i+1} \uplus (b \uplus x_i) \\
&= (x_{i+1} \uplus x_i) \uplus b \\
&= x_{i+1} \uplus b \\
&= y_{i+1}.
\end{align*}

Assume that the sub-modular inequality holds for all the sub-lattices
$L_i = \{x_i, x_{i+1}, y_i, y_{i+1}\}$. Then we can add all
the inequalities $h(x_{i+1}) + h(y_i) \ge h(x_i) + h(y_{i+1})$ to
get $h(x_n) + h(y_1) \ge h(x_1) + h(y_n)$. Note that we can
choose the sequence $x_1, x_2, \dots, x_n$ such that $x_{i+1}$ covers
$x_i$ and such that $x_1 = a \cap b$ and $x_n = a$, so that $y_1 = b$ and
$y_n = a \uplus b$. Therefore it is
sufficient to prove that $h(a) + h(b) \ge h(a \uplus b) + h(a \cap b)$
when a covers $a \cap b$.

Similarly it is sufficient to prove that $h(a) + h(b) \ge
h(a \uplus b) + h(a \cap b)$ when b covers $a \cap b$. If a and b both
cover $a \cap b$ then $M = \{a, b, a \cap b, a \uplus b\}$ is a sub-lattice of
$x^+$ if $x = a \cap b$.
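The telescoping step can be illustrated on a concrete modular lattice. The sketch below is an assumption-laden example rather than part of the proof: it uses the divisors of 360 ordered by divisibility, with gcd as meet and lcm as join (a distributive and therefore modular lattice), and a made-up submodular increasing function `h` (the number of prime factors, capped at 3). The names `omega`, `h` and `check_chain` are hypothetical. For a chain from $a \cap b$ up to $a$ it checks the two lattice identities $x_{i+1} \cap y_i = x_i$ and $x_{i+1} \uplus y_i = y_{i+1}$, each local inequality $h(x_{i+1}) + h(y_i) \ge h(x_i) + h(y_{i+1})$, and the resulting inequality for $a$ and $b$.

```python
from math import gcd


def lcm(u, v):
    """Join in the divisor lattice."""
    return u * v // gcd(u, v)


def omega(d):
    """Number of prime factors of d, counted with multiplicity."""
    count, p = 0, 2
    while d > 1:
        while d % p == 0:
            d //= p
            count += 1
        p += 1
    return count


def h(d):
    """Assumed example of a submodular, increasing function on the divisor lattice."""
    return min(omega(d), 3)


def check_chain(a, b, chain):
    """chain runs from gcd(a, b) up to a; y_i is the join of x_i with b."""
    ys = [lcm(x, b) for x in chain]
    for i in range(len(chain) - 1):
        if gcd(chain[i + 1], ys[i]) != chain[i]:        # x_{i+1} meet y_i = x_i
            return False
        if lcm(chain[i + 1], ys[i]) != ys[i + 1]:       # x_{i+1} join y_i = y_{i+1}
            return False
        if h(chain[i + 1]) + h(ys[i]) < h(chain[i]) + h(ys[i + 1]):
            return False
    # Summing the local inequalities gives the inequality for a and b.
    return h(a) + h(b) >= h(gcd(a, b)) + h(lcm(a, b))
```

With `a = 72`, `b = 30` and the chain `[6, 12, 24, 72]`, each step of the chain is a covering relation, and the joins with `b` are 30, 60, 120 and 360.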


Appendix G 
Proof of Theorem [HI 

If the lattice is one-dimensional we just get a determinis¬ 
tic Markov chain for which positivity and monotonicity are 
sufficient conditions for a function to be entropic. Assume 
that the lattice is two-dimensional. 

We will show that an extreme polymatroid function only
takes the values 0 and 1. Assume that $(A, \le)$ is the poset of
irreducible elements in the distributive lattice. The proof
is by induction on the number of elements k in the lattice.
If $k = 2$ this is obvious. Assume that it has been proved for
all distributive lattices where $k \le n$ and let L be a lattice
with $n + 1$ elements. We note that a distributive lattice
is modular. Therefore it is sufficient that the sub-modular
inequality is satisfied on sub-lattices of the form $x^+$. We
know that the lattice is a sub-lattice of a product of two
totally ordered lattices. Such a product lattice is planar in
the sense that it has a Hasse diagram without intersecting
lines. The Hasse diagram consists of small squares each
representing a sub-lattice of the form $x^+$.

Now consider a polymatroid function h on the lattice. 
Assume that h is an extreme point in the set of all 
polymatroid functions. For each point in the lattice the 
value is constrained by a number of inequalities and since 
we have assumed that the function is an extreme point at 
least one of the inequalities holds with equality. We start
at the double irreducible element b. It is constrained by two
monotone inequalities and one submodular inequality:

\begin{align*}
h(a) &\le h(b) \\
h(b) &\le h(d) \\
h(a) + h(d) &\le h(b) + h(c). \tag{2}
\end{align*}

The submodular inequality implies that $h(b) \ge h(a) +
(h(d) - h(c))$, which is a stronger condition than the first
monotone condition. Therefore the conditions on $h(b)$ are

\[ h(a) + h(d) - h(c) \le h(b) \le h(d). \tag{3} \]

Observe that $h(a) + h(d) - h(c) \le h(d)$. Since both
the lower bound on $h(b)$ and the upper bound on $h(b)$
are linear, any extreme polymatroid is also an extreme
polymatroid when it is restricted to the lattice where the
element b has been removed. According to the induction
hypothesis such an extreme polymatroid function only
takes the values 0 and 1. Therefore if the polymatroid
function on the original lattice is extreme, one of the
inequalities in (3) must hold with equality and therefore
$h(b) = 0$ or $h(b) = 1$.

Since an extreme polymatroid function only takes the 
values 0 and 1 the lattice generated by the polymatroid 
function has only two elements and this is obviously 
entropic. 

If the lattice is three-dimensional one has to modify
the above procedure. A three-dimensional distributive
lattice may not have any double irreducible elements.
If a single element is deleted from the lattice
it is no longer modular, but modularity is needed
if we are to use Lemma [?]. Instead one considers
sequences $(a_1, a_2, \dots, a_n)$, $(b_1, b_2, \dots, b_n)$, $(c_1, c_2, \dots, c_n)$, and
$(d_1, d_2, \dots, d_n)$ with the conditions



Figure 8. The upper right corner of the lattice. 

One can then prove that any extreme polymatroid function
only takes the values 0, 1/2, and 1 by a more complicated
induction argument. One can then use [?] to conclude
that any extreme polymatroid function

Appendix H 
Proof of Theorem [T7] 

We use that it was recently proved that a planar
modular lattice can be represented as a distributive lattice
with a number of double irreducible elements added
[?]. The proof has the same structure as for distributive
lattices, but the existence of the double irreducible
elements implies that there are also extreme polymatroid
functions that are proportional to the ranking function.
Let $X_1, X_2, \dots, X_m, Y_1, Y_2, \dots, Y_n$ denote independent random
variables uniformly distributed over $\mathbb{Z}_p$ for some large
value of p. Let $Z_{ij}$ denote the random variable

\[ \biguplus_{\ell \le i} X_\ell \uplus \biguplus_{\ell \le j} Y_\ell \]

and, for $k > 0$, consider the random variable

\[ \biguplus_{\ell \le i} X_\ell \uplus \biguplus_{\ell \le j} Y_\ell \uplus \left( X_{i+1} + k \cdot Y_{j+1} \right). \]

The way to index the variables can be seen in
Figure 7. Then the entropy is proportional to the ranking
function.


\begin{align*}
h(a_j) + h(d_j) - h(c_j) &\le h(b_j) \le h(d_j), \\
h(d_{j+1}) - h(d_j) &\le h(b_{j+1}) - h(b_j) \le h(b_{j+1}) - h(b_j).
\end{align*}



Figure 7. A planar modular lattice with indexing of the elements. 










